Image Analysis Device, Image Analysis System, and Image Analysis Method

ABSTRACT

The purpose of the present invention is to provide an image analysis technique enabling a detection subject to be rapidly detected from image data. This image analysis device generates metadata for a query image containing the detection subject, and using the metadata, narrows down the image data serving as the search subject beforehand and then conducts object detection.

TECHNICAL FIELD

The present invention relates to a technique for detecting specific objects included in image data.

BACKGROUND ART

Along with the development of IT infrastructures for individuals and companies, a huge amount of multimedia data (such as documents, videos/images, voices, or various log data) has been accumulated in large storages. In order to extract information efficiently from this vast amount of stored data, various information search techniques for individual media types have been invented and put into practical use.

As an example of information search on multimedia data, consider a method for detecting objects or specific regions included in images. Object detection and region identification in images correspond to morphological analysis in document analysis (a means of separating documents into words to determine word classes), and are important in analyzing the meanings of images.

As a method for detecting objects in images, the method of Non Patent Literature 1 is commonly known and is commercialized as the face region detecting function in digital cameras and monitoring systems. In the method of Non Patent Literature 1, a vast number of image samples of the detection target is collected, and multiple discriminators based on image brightness are generated by machine learning. These discriminators are combined to form a determinator for partial regions of the image. The object region is identified by exhaustively searching the partial regions in the image.

The detection targets are currently usually human faces. However, if a wide range of contents stored in storages is the detection target, it is desirable to detect various objects such as cars, animals, buildings, diagrams, or various goods. In addition, in order to process huge amounts of data, the efficiency of the analysis process must be improved.

Regarding improvement in efficiency of the analysis process, Patent Literature 1 listed below describes a method that utilizes the existence probability of objects, thereby limiting the region to which image processing for detecting object regions is applied. The method of Patent Literature 1 utilizes static information of the imaging system, such as focal distance or resolution, to determine the regions to which image processing is applied. It may be advantageous in environments where imaging conditions or imaging devices are constrained, such as in-vehicle cameras, and where structured data is managed.

CITATION LIST

Patent Literature

-   Patent Literature 1: JP Patent Publication (Kokai) 2010-003254 A

Non Patent Literature

-   Non Patent Literature 1: P. Viola and M. Jones, “Robust real-time object detection”, IJCV 2001, Vol. 57, No. 2, pp. 137-154, 2002.

SUMMARY OF INVENTION

Technical Problem

The technique described in Patent Literature 1 assumes that the imaging environment is specified to some degree and that the target data for image processing is structured. In general, however, the imaging environment or the photographic subject is not always predictable in advance. In addition, in environments where the target data for image processing is generated in an ad hoc manner, such data is not structured. In such environments, the method described in Patent Literature 1 may not be effective for reducing the time to detect objects.

The technique described in Non Patent Literature 1 is effective if the detection target is predefined, as in face detection. However, in applications where users sequentially specify different detection targets, it is necessary to collect samples and perform machine learning for each target individually, which is impractical in terms of processing time.

The present invention is made in light of the above-described technical problems. It is an objective of the present invention to provide image analysis techniques that can rapidly detect targets within image data.

Solution to Problem

An image analysis device according to the present invention generates metadata of a query image including the detection target, narrows down the image data to be searched using the metadata in advance, and then performs object detection.

Advantageous Effects of Invention

With the image analysis device according to the present invention, it is possible to rapidly extract images including an arbitrary object from a vast amount of image data.

Technical problems, configurations, and effects other than those mentioned above will become apparent from the following description of embodiments.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a configuration diagram of an image analysis system 100 according to an embodiment 1.

FIG. 2 is a diagram showing a configuration and a data example of an image database 105.

FIG. 3 is a diagram showing a dataflow explaining a sequence to generate metadata of a query image specified by a user and to narrow an object detection target using the metadata.

FIG. 4 is a flowchart showing a process by the image analysis system 100 to identify an object region in an image.

FIG. 5 is a diagram showing a sequence by a metadata generator 108 to generate metadata of the query image.

FIG. 6 is a flowchart showing a process sequence by the metadata generator 108 to generate metadata of a query image 301.

FIG. 7 is a diagram showing a detection method for object regions in step S407 in FIG. 4.

FIG. 8 is a flowchart showing a process for an object region detector 110 to detect an object.

FIG. 9 is a diagram showing a process sequence between the functional units in a process where the image analysis system 100 identifies object regions in images.

FIG. 10 is a diagram showing a configuration example of an operational screen that is used for acquiring images including specified objects from the image database 105.

FIG. 11 is a diagram showing an example of extending bibliographic information.

FIG. 12 is a flowchart showing a sequence of the process of extending bibliographic information.

FIG. 13 is a Venn diagram of analyzed targets explaining a process in which detection failure is decreased by extending bibliographic information.

FIG. 14 is a chart showing a relationship between processing time of image analysis and coverage.

FIG. 15 is a diagram showing a method for increasing accuracy of object detection by expanding the templates that are used in searching for images similar to the query image.

FIG. 16 is a schematic diagram of a content cloud system 1600 according to an embodiment 4.

DESCRIPTION OF EMBODIMENTS

Embodiment 1: System Configuration

FIG. 1 is a configuration diagram of an image analysis system 100 according to an embodiment 1 of the present invention. The image analysis system 100 is a system whose objective is to search, among a vast amount of images, for images including an arbitrary object specified by a user. The image analysis system 100 includes an image/document storage device 101, an input device 102, a display device 103, a data storage device 104, an image database 105, and an image analysis device 106.

The image/document storage device 101 is a storage medium storing image data. The image/document storage device 101 may be configured using storage systems connected to networks, such as NAS (Network Attached Storage) or SAN (Storage Area Network). The amount of image data analyzed by the image analysis system 100 is assumed to be, for example, several hundred thousand images.

The input device 102 is an input interface, such as a mouse, keyboard, or touch device, for transferring user operations to the image analysis device 106. The display device 103 is an output interface such as a liquid crystal display. The display device 103 is used for displaying image analysis results of the image analysis device 106 and for interactive operations with users. The data storage device 104 is a storage that stores analysis results of the image analysis device 106, so that upper-layer applications can utilize the analysis results.

The image database 105 is a database management system for storing images. The image database 105 not only stores the data to be analyzed but is also utilized in the analysis process itself as a dictionary for generating metadata. Details will be described later using FIG. 2.

The image analysis device 106 is a device that detects, among the image data stored in the image database 105, objects included in the query image specified by the user. The image analysis device 106 includes an image/document input unit 107, a metadata generator 108, an analysis target determinator 109, an object region detector 110, an operational information input unit 111, and a data output unit 112.

The image/document input unit 107 reads out, from the image/document storage device 101, the image data to be stored in the image database 105 and its related bibliographic information, associates them with each other, and stores them into the image database 105. The image/document input unit 107 also reads out the query image including the detection target object, and passes the query image to the metadata generator 108 and to the object region detector 110.

The metadata generator 108 automatically generates metadata of the query image by an image recognition process using the image database 105 as a dictionary. The metadata mentioned here is highly abstract information about the image data: for example, words describing the image, the creation date, or the creation location. Hereinafter, for the sake of simplicity, it is assumed that “metadata = word”. However, the metadata generator 108 may generate various kinds of metadata. A reliability value is assigned to each piece of metadata. The generated metadata is sent to the analysis target determinator 109. The sequence for generating metadata will be described later using FIG. 5.
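Under the “metadata = word” simplification, the generated metadata can be pictured as a scored word list. The following is a minimal sketch; the field names are our own illustrative assumptions, not part of the specification:

```python
# Hypothetical representation of the metadata produced by the metadata
# generator 108; each word carries a reliability value.
query_metadata = [
    {"word": "star",        "reliability": 0.92},
    {"word": "pentagram",   "reliability": 0.74},
    {"word": "astral body", "reliability": 0.41},
]

# Sorted by reliability before being passed to the analysis target
# determinator 109.
query_metadata.sort(key=lambda m: m["reliability"], reverse=True)
```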

The analysis target determinator 109 searches the bibliographic information stored in the image database 105 using the metadata generated by the metadata generator 108 as a search key, thereby acquiring a list of image data whose bibliographic information matches the search key. The metadata used as the search key may be selected automatically according to its reliability, or may be selected by the user from metadata candidates. If the user selects the metadata used as the search key, a list of metadata candidates and the number of search hits are presented to the user through the data output unit 112, enabling an interactive operation between the user and the image analysis device 106. In addition, search parameters, such as the metadata specified as the search key or thresholds, are received through the operational information input unit 111. The acquired image list is sent to the object region detector 110 as candidates for the analysis target.

The object region detector 110 identifies, using an image analysis process, the coordinates of the region in the image where the specified object is present. The detection target is not fixed and may be specified by the user each time. In addition, objects of various concepts (e.g. human face, car, cat, star mark, etc.) may be detection targets simultaneously. The analysis result is sent to the data output unit 112 as the coordinates of the rectangular region of the object (e.g. [horizontal coordinate of the top-left of the rectangle, vertical coordinate of the top-left, horizontal coordinate of the bottom-right, vertical coordinate of the bottom-right]) and a reliability value indicating the “likelihood of the object”. At this time, the metadata generated by the metadata generator 108 may be output in association with the analysis result as meaning information of the detected object.

The operational information input unit 111 receives user operations from the input device 102 and sends the corresponding signals to the image analysis device 106. The data output unit 112 receives the image list for image analysis or the image analysis result, and outputs them to the display device 103 and to the data storage device 104.

FIG. 2 is a diagram showing a configuration and a data example of the image database 105. A configuration example in table format is shown here; however, any data format is allowed. The image database 105 is a database that stores image features and bibliographic information in association with each other. The image database 105 includes an image ID field 1051, an image data field 1052, an image feature field 1053, and a bibliographic information field 1054.

The image ID field 1051 stores identifiers of each piece of image data. The image data field 1052 stores image data in binary format, and is used when the user checks the analysis result. The image feature field 1053 stores fixed-length numerical vector data quantifying features of the image itself, such as color or shape. The bibliographic information field 1054 stores bibliographic information (such as sentences, categories, date and time, or location) associated with the image. The bibliographic information field may be separated into multiple fields if necessary.
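To make the layout of FIG. 2 concrete, the following is a minimal sketch of how the four fields might be declared, assuming an SQL-backed implementation; the table and column names are illustrative assumptions:

```python
import sqlite3

# Illustrative reconstruction of the image database 105 layout in FIG. 2.
conn = sqlite3.connect("image_db.sqlite")
conn.execute("""
    CREATE TABLE IF NOT EXISTS images (
        image_id      INTEGER PRIMARY KEY,  -- image ID field 1051
        image_data    BLOB,                 -- image data field 1052 (binary)
        image_feature BLOB,                 -- image feature field 1053 (fixed-length vector)
        biblio_info   TEXT                  -- bibliographic information field 1054
    )
""")
conn.commit()
```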

Embodiment 1: Operations of Each Unit

The overall configuration of the image analysis system 100 has been described so far. Hereinafter, the operational principle of the image analysis system 100 will be summarized, and then the detailed operation of each functional unit will be described.

The image analysis system 100 searches, from the image database 105 using an image recognition process, for image data that includes the object contained in the query image specified by the user. Naively, an object detection process could be performed on every image in the image database 105. However, the processing speed of object detection is usually slow, so it is not practical to apply object detection to the entire vast image set.

For example, suppose that 0.5 second of image recognition processing is required per image. Then about 140 hours are necessary to analyze one million images. If the detection target is limited to, say, the “front face of a human”, the analysis process can be performed only once when building the database, and the analysis result can be reused in subsequent analyses to reduce processing time. However, if the detection target is not fixed and an arbitrary specified object is to be detected, the analysis process must be performed after the user specifies the object to be detected. Thus the response time may be problematic.

Thus the image analysis system 100 automatically generates the metadata of the detection target object, and attempts to reduce the processing time by using the metadata to narrow down the image data to which the object detection process is applied.

FIG. 3 is a diagram showing a dataflow explaining a sequence to generate metadata of the query image specified by the user and to narrow the object detection target using the metadata. As shown in FIG. 3, it is assumed that image data and bibliographic information are already stored in the image database 105.

The query image 301 is the query image input by the user through the image/document input unit 107. It is assumed here that only one object (a star shape) is present in the query image 301.

The metadata generator 108 generates metadata 302 of the query image 301 (S301). The metadata 302 is output as a list with scores (= reliabilities of the metadata). Details of generation of the metadata 302 will be described later using FIG. 5.

The analysis target determinator 109 searches, using the metadata generated by the metadata generator 108 as a search key, for bibliographic information matching the search key in the image database 105, thereby acquiring a group 303 of image data matching the search condition (S302). An example is shown where an OR search of the three words “star”, “pentagram”, and “astral body” is performed; an AND search may be combined if necessary. An image whose metadata is similar to that of the query image is highly likely to contain the target object.
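A minimal sketch of such an OR search, reusing the `conn` handle and `images` table assumed in the schema sketch above; simple substring matching stands in for whatever text search the database actually provides:

```python
def search_by_metadata(conn, words):
    """OR search of the bibliographic information field using the
    generated metadata words as search keys (step S302)."""
    clause = " OR ".join("biblio_info LIKE ?" for _ in words)
    params = ["%" + w + "%" for w in words]
    cur = conn.execute("SELECT image_id FROM images WHERE " + clause, params)
    return [row[0] for row in cur.fetchall()]

# The OR search of the three words from FIG. 3:
group_303 = search_by_metadata(conn, ["star", "pentagram", "astral body"])
```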

The object region detector 110 identifies, for each image in the image group 303, the regions where objects similar to the object included in the query image 301 are present (S303). The processing time of this step increases with the number of images included in the image group 303. Details of object detection will be described later using FIG. 7.

The detection result 304 is described, for each image in the image group 303, with the number of detected objects, the location of each object (the dotted rectangle in the detection result 304), and the reliability of the “likelihood of the object” (the percentage in the detection result 304), for example. As the meaning information of each detected object, the metadata generated in step S301 may be associated with the detection result 304. The data output unit 112 displays the detection result 304 on the display device 103, or outputs the detection result 304 as data into the data storage device 104.

As shown in FIG. 3, the image analysis system 100 performs object detection after limiting, using the metadata of the query image 301, the images that are highly likely to include the detection target object. Thus it is possible to reduce the processing time.

On the other hand, the “appearance” of an image does not always match its bibliographic information in terms of meaning. In the example of image 305, the bibliographic information matches the search key but the “appearance” of the object is different. In the example of image 306, although the image includes an object whose “appearance” is similar, the bibliographic information does not include words matching the condition. The former case increases the processing time because of a redundant image analysis process. The latter case causes a detection failure. A method for decreasing detection failures will be described in an embodiment 2.

FIG. 4 is a flowchart showing a process by the image analysis system 100 to identify the object region in an image. Hereinafter, each step in FIG. 4 will be described.

(FIG. 4: Step S401)

The image/document input unit 107 stores the received image data and bibliographic information into the image database 105. The image database 105 extracts the image feature from the image data, and stores the image feature in association with the bibliographic information. The image/document input unit 107 may instead perform the process of extracting the image feature. This step only needs to be performed before step S402 and subsequent steps; it is not necessary to perform it every time this flowchart is executed.

(FIG. 4: Steps S402-S403)

The image/document input unit 107 acquires the query image including the detection target object (S402). The metadata generator 108 generates the metadata of the query image (S403). Details will be described later using FIG. 5.

(FIG. 4: Step S404)

The analysis target determinator 109 determines, among the metadata generated by the metadata generator 108 in step S403, the metadata to be used for narrowing down the image data to which object detection is applied. Specifically, the metadata may be determined mechanically according to its reliability (e.g. automatically selecting those with the highest reliabilities within a predetermined range). Alternatively, the metadata may be presented to the user through the data output unit 112, and the user may select any of the presented metadata.

(FIG. 4: Step S405)

The analysis target determinator 109 searches the bibliographic information stored in the image database 105 using the metadata selected in step S404 as a search key, thereby acquiring a group of image data matching the search key. This image group is the target of the object detection process.

(FIG. 4: Steps S406-S408)

The image analysis device 106 performs step S407 for each piece of image data included in the image group acquired in step S405. In step S407, the object region detector 110 extracts, from each image included in the image group acquired in step S405, the regions that are similar to the object included in the query image. The method for extracting the object region will be described later using FIG. 7.

(FIG. 4: Step S409)

The data output unit 112 outputs the detection result of the object regions detected by the object region detector 110. The detection result may be output in the processed order, or after sorting on the basis of the number of detected objects or the reliability. Further, as shown in the detection result 304 in FIG. 3, additional information such as the number of detected objects, the detection reliability, or the rectangle indicating the detected object region may be output along with the detection result. The detection result may be displayed on the display device 103, or may be output in the form of data describing the detection result and the above-mentioned additional information.

(FIG. 4: Step S410)

If there are no more objects to be detected (if there is no further instruction from the user), this flowchart terminates. If there are other objects in the query image, or if other objects are to be detected, such as when the user newly specifies another query image, the flowchart returns to step S402 and the same processes are performed.

FIG. 5 is a diagram showing a sequence by the metadata generator 108 to generate metadata of the query image. Hereinafter, each step in FIG. 5 will be described.

(FIG. 5: Step S501)

The metadata generator 108 searches, using the query image 301 as a search key, for images similar to the search key in the image database 105. Similar image search is a method that extracts information such as colors or shapes of an image as high-dimensional vector information, and evaluates the similarity between images on the basis of the distance between the vectors. As a result, a group 501 of images whose “appearance” is similar to that of the query image 301 is acquired. In addition, since the image database 105 stores images and bibliographic information in association with each other, a group 502 of bibliographic information is acquired from the group 501 of similar images.

(FIG. 5: Step S502: Sequence 1)

The metadata generator 108 extracts characteristic words included in the bibliographic information group. It is desirable if organized data such as image category codes is attached as the bibliographic information. But even if documents such as descriptive texts are attached, such documents are highly likely to include characteristic words conveying the meaning of the image. In this step, the metadata generator 108 therefore separates each piece of bibliographic information into atomic data (minimum units), e.g. separates documents into words, and treats each minimum unit as metadata. In this way it is possible to generate the metadata of the query image 301.

(FIG. 5: Step S502: Sequence 2)

The metadata generator 108 counts the frequency with which each piece of metadata generated in sequence 1 appears in the bibliographic information, and calculates a score for each piece of metadata using this appearance frequency. The appearance frequency itself may simply be used as the score, with the metadata sorted in descending order of score. Alternatively, an evaluation indicator in which the appearance frequency is weighted may be used as the score.

(FIG. 5: Step S502: Example of Calculating the Score No. 1)

TF-IDF (Term Frequency-Inverse Document Frequency) may be used as the score of the metadata. TF-IDF is an evaluation indicator in which the frequency tf(t) of the metadata t is multiplied by an inverse document frequency idf(t). The inverse document frequency idf(t) is calculated by Equation 1 below, where N is the number of records in the database, and df(t) is the frequency of the bibliographic information including the metadata t across the whole database.

[Equation  1]                                      $\begin{matrix}{{{idf}(t)} = {\log \frac{N}{{df}(t)}}} & ( {{Equation}\mspace{14mu} 1} )\end{matrix}$

(FIG. 5: Step S502: Example of Calculating the Score No. 2)

A stochastic evaluation indicator may also be used as the score of the metadata. For example, when evaluating a metadata t, the scale kl(t) of the difference between the probability distributions p(t) and q(t), shown in Equations 2-4 below, may be used as the score, where q(t) is the probability that the metadata t is included in the bibliographic information of an image picked randomly from the whole database, and p(t) is the probability that the metadata t is included in the bibliographic information of an image picked randomly from the image group of the similar image search result.

[Equation  2]                                      $\begin{matrix}{{{kl}(t)} = {{{p(t)}\mspace{11mu} \log \frac{p(t)}{q(t)}} + {( {1 - {p(t)}} )\mspace{11mu} \log \; \frac{1 - {p(t)}}{1 - {q(t)}}}}} & ( {{Equation}\mspace{14mu} 2} ) \\{{p(t)} = \frac{{df}^{\prime}(t)}{M}} & ( {{Equation}\mspace{14mu} 3} ) \\{{q(t)} = \frac{{df}(t)}{N}} & ( {{Equation}\mspace{14mu} 4} )\end{matrix}$

df′(t): frequency of bibliographic information including the metadata t in the similar image search result

M: number of images in the similar image search result
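Equations 2-4 translate directly into code. A minimal sketch, assuming 0 < p(t) < 1 and 0 < q(t) < 1 so the logarithms are defined, with made-up numbers in the usage line:

```python
import math

def kl_score(df_prime_t, M, df_t, N):
    """Score of metadata t per Equations 2-4: the divergence between
    p(t), its frequency in the similar image search result, and q(t),
    its frequency over the whole database."""
    p = df_prime_t / M   # Equation 3
    q = df_t / N         # Equation 4
    # Equation 2: binary KL divergence between p(t) and q(t)
    return (p * math.log(p / q)
            + (1 - p) * math.log((1 - p) / (1 - q)))

# e.g. "star" occurring in 40 of 50 similar images but only 500 of
# 100,000 database records yields a large (highly characteristic) score:
score = kl_score(40, 50, 500, 100_000)   # roughly 3.7
```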

FIG. 6 is a flowchart showing a process sequence by the metadata generator 108 to generate the metadata of the query image 301. Hereinafter, each step in FIG. 6 will be described.

(FIG. 6: Steps S601-S602)

The metadata generator 108 calculates the image feature of the query image 301 (S601), and performs a similar image search using the image feature extracted in step S601 as a search key (S602). The smaller the distance between the feature vectors of two images, the higher the similarity between the images. The search result is sorted according to the distance and then output.
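A minimal sketch of this distance-based search, assuming the image features of field 1053 are held in a NumPy matrix with one row per stored image:

```python
import numpy as np

def similar_image_search(query_feature, features, top_k=10):
    """Steps S601-S602 sketch: rank stored images by Euclidean distance
    between feature vectors; smaller distance means higher similarity."""
    distances = np.linalg.norm(features - query_feature, axis=1)
    order = np.argsort(distances)[:top_k]     # sorted by distance
    return order, distances[order]            # row indices and distances

# e.g. 100,000 stored images with 64-dimensional color/shape features:
features = np.random.rand(100_000, 64)
indices, dists = similar_image_search(np.random.rand(64), features)
```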

(FIG. 6: Steps S603-S607)

The metadata generator 108 performs steps S604-S606 for each of the similar images acquired in step S602.

(FIG. 6: Steps S604-S605)

The metadata generator 108 reads out, from the image database 105, the bibliographic information associated with the similar image acquired in step S602 (S604). The metadata generator 108 separates the bibliographic information acquired in step S604 into atomic data and uses them as the metadata (S605). For example, if the bibliographic information is a document, morphological analysis is performed to separate the bibliographic information into words. For the sake of efficiency, this separation process may be performed in advance when storing the document into the image database 105.

(FIG. 6: Step S606)

The metadata generator 108 counts the frequency with which the metadata generated in step S605 appears in the bibliographic information read out in step S604, and accumulates this frequency for each piece of metadata throughout steps S603-S607. At this time, in order to reflect the image similarity in the frequency of the metadata, the frequency may be weighted by the similarity before being added to the accumulated frequency.

(FIG. 6: Step S608)

The metadata generator 108 calculates the score of each piece of metadata using the accumulated frequencies calculated in steps S603-S607. The method for calculating the score is as described in FIG. 5.

(FIG. 6: Step S609)

The metadata generator 108 sorts the metadata in order of the score calculated in step S608, excludes metadata whose score is at or below a threshold, and outputs the sorted result.
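Putting steps S603-S609 together, a minimal sketch of the whole accumulation-and-scoring loop; the input format and the score function are assumptions, and any of the scores from FIG. 5 could be plugged in:

```python
from collections import defaultdict

def generate_metadata(similar_results, score_fn, threshold):
    """FIG. 6 sketch. `similar_results` is assumed to be a list of
    (similarity, words) pairs, where `words` is the atomized
    bibliographic information of one similar image (S604-S605)."""
    accumulated = defaultdict(float)
    for similarity, words in similar_results:
        for word in words:
            accumulated[word] += similarity   # similarity-weighted frequency (S606)
    scored = [(w, score_fn(w, f)) for w, f in accumulated.items()]  # S608
    scored = [(w, s) for w, s in scored if s > threshold]           # S609 cutoff
    return sorted(scored, key=lambda x: x[1], reverse=True)         # S609 sort
```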

FIG. 7 is a diagram showing a detection method for object regions in step S407 in FIG. 4. This method detects the regions where the object is present in an image by using the image of the object to be detected as a template and finding regions matching the template.

Firstly, the image feature of a typical image (template) of the object to be detected is extracted and stored in a template database 704. The template image mentioned here corresponds to the query image 301. If it is desired to detect multiple objects, the template database 704 may store multiple templates (images of the detection targets) corresponding to each of the objects. The templates stored in the template database 704 are reset every time the object to be detected changes.

When an input image 701 (an image in the image database 105) that is a target of object detection is given, the object region detector 110 varies the location and size of a scan window 702 to extract candidate regions 703 of the object. Next, for each candidate region 703, the object region detector 110 searches the template database 704 for the template whose feature vector is closest to that of the candidate region 703. If the distance between the feature vector of the found template and that of the candidate region 703 is at or below a predetermined threshold, it is determined that the candidate region 703 includes the object of the template, and the candidate region 703 is added to the detection result. At this time, the distance between the feature vector of the closest template and that of the candidate region 703 may be used as the reliability of the detection result.
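A minimal sketch of this matching step, assuming the candidate regions have already been cut out by the scan window and reduced to feature vectors, and that the templates are rows of a NumPy matrix; the mapping from distance to a reliability value is our own illustrative choice:

```python
import numpy as np

def detect_object_regions(candidates, template_features, dist_threshold):
    """FIG. 7 sketch: `candidates` is a list of (rect, feature) pairs,
    where rect = (x1, y1, x2, y2) comes from the scan window 702.
    A candidate is accepted if its nearest template is within the
    distance threshold."""
    results = []
    for rect, feature in candidates:
        distances = np.linalg.norm(template_features - feature, axis=1)
        best = float(distances.min())
        if best <= dist_threshold:
            reliability = 1.0 / (1.0 + best)  # illustrative distance-to-score mapping
            results.append((rect, reliability))
    return results
```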

FIG. 8 is a flowchart showing a process for the object region detector 110 to detect the object. Hereinafter, each step in FIG. 8 will be described.

(FIG. 8: Step S800)

The object region detector 110 calculates the feature of the template and stores it into the template database. If there are multiple input images 701 as object detection targets and the same template is used for the detection process, this step needs to be performed only once, at the first iteration.

(FIG. 8: Step S801)

The object region detector 110 extracts the candidate regions 703 in the input image 701. The candidate regions 703 are extracted mechanically by moving the scan window stepwise and by changing the size of the scan window.

(FIG. 8: Steps S802-S806)

The object region detector 110 performs steps S803-S805 for all of the candidate regions 703.

(FIG. 8: Step S803)

The object region detector 110 calculates the reliability of the candidate region 703. For example, as described in FIG. 7, the distance between the feature of the template and the feature of the candidate region 703 may be used for calculating the reliability.

(FIG. 8: Steps S804-S805)

If the reliability of the candidate region 703 calculated in step S803 is at or below the predetermined threshold (consistent with the distance-based reliability of FIG. 7), the flowchart proceeds to step S805; otherwise the flowchart skips step S805 (S804). The object region detector 110 adds such a candidate region 703 to the detection result list (S805).

(FIG. 8: Step S807)

The object region detector 110 outputs the detection result list, and this process flow terminates. The detection result is output as a set of coordinate information in the input image 701 (e.g. [horizontal coordinate of the top-left of the rectangle, vertical coordinate of the top-left, horizontal coordinate of the bottom-right, vertical coordinate of the bottom-right]) and the reliability.

FIG. 9 is a diagram showing the process sequence between the functional units when the image analysis system 100 identifies object regions in images. Hereinafter, each step in FIG. 9 will be described.

(FIG. 9: Steps S901-S902)

The user inputs, through the input device 102, images to be stored in the image database 105 and documents associated with the images (S901). The group of images and documents is sent to the image database 105 through the image analysis device 106. The image database 105 extracts features from the images received from the image analysis device 106, and stores the features in association with the bibliographic information acquired from the documents (S902). Steps S901-S902 correspond to step S401 in FIG. 4.

(FIG. 9: Steps S903-S906)

The user inputs an image (query image) of the object to be detected (S903). The image analysis device 106 requests the image database 105 to search for similar images using the query image as a search key (S904). The image database 105 extracts image features from the query image, searches for images similar to the query image using the image features, and returns the similar images and their bibliographic information to the image analysis device 106 (S905). The image analysis device 106 generates the metadata of the query image using the bibliographic information received from the image database 105, and calculates the score of the metadata (S906).

(FIG. 9: Steps S907-S908)

The image analysis device 106 presents the metadata generated in step S906 and its scores to the user through the display device 103 or the data storage device 104 (S907). The user, referring to the metadata and its scores, selects the metadata to be used for narrowing down the images to be searched (S908). It is also possible for the image analysis device 106 to select the metadata automatically, for example by omitting step S908 and selecting metadata in descending order of score.

(FIG. 9: Steps S909-S910)

The image analysis device 106 requests the image database 105 to search, using the metadata selected by the user in step S908 as a search key, for images whose bibliographic information matches the search key (S909). The image database 105 searches the bibliographic information corresponding to the search query, and returns the images associated with the matching bibliographic information to the image analysis device 106 (S910).

(FIG. 9: Step S911)

The image analysis device 106 detects the objects included in the query image in each of the images acquired in step S910, thereby identifying the regions similar to the query image. The detection result is described with the coordinates of the rectangle of the object in the image (e.g. [horizontal coordinate of the top-left of the rectangle, vertical coordinate of the top-left, horizontal coordinate of the bottom-right, vertical coordinate of the bottom-right]) and the reliability indicating the “likelihood of the object”. The detection result is output through the data output unit 112.

FIG. 10 is a diagram showing a configuration example of an operational screen that is used for acquiring images including specified objects from the image database 105. This screen may be provided on the display device 103. The user sends operational information to the operational information input unit 111 by operating a cursor 1006 displayed on the screen using the input device 102.

The operational screen in FIG. 10 has a query image input area 1001, a similar image search button 1002, a metadata generation button 1003, a similar image display area 1004, a metadata display area 1007, a detection target number display area 1008, a predicted processing time display area 1009, a detection start/stop button 1010, and a detection result display area 1011.

The user first inputs a query image stored in the image/document storage device 101 into the query image input area 1001. A dialog specifying file paths in the file system may be used, for example; alternatively, intuitive operations such as drag & drop may be used.

When the user clicks the similar image search button 1002, the image analysis device 106 acquires images similar to the query image from the image database 105, and displays them in the similar image display area 1004. The image analysis device 106 generates the metadata of the query image using the bibliographic information of the similar images displayed in the similar image display area 1004. The metadata may be generated using all of the similar images; alternatively, the user checks the similar images and specifies which of them are to be used, for example with a checkbox 1005. In the example shown in FIG. 10, the checkbox of the rightmost similar image is unchecked so that this similar image will not be used when generating the metadata.

When the metadata generation button 1003 is clicked, the metadata generator 108 generates the metadata using the bibliographic information associated with the selected similar images, and displays the generated metadata in the metadata display area 1007. The metadata display area 1007 also displays, for each piece of metadata, the number of images whose bibliographic information includes it. If the search over the bibliographic information is sufficiently fast, the number of images retrieved by each piece of metadata alone may also be displayed.

The user selects the metadata to be used for narrowing down the images to which object detection is applied, considering the metadata, its score, and the number of images, using a checkbox 1012, for example. The image analysis device 106 searches the bibliographic information every time the checkbox 1012 is clicked, and displays the number of images whose bibliographic information includes the selected metadata in the detection target number display area 1008. In addition, the predicted processing time for performing object detection on that number of images is displayed in the predicted processing time display area 1009. The processing time may be approximated from the number of images to which object detection is applied, as shown in the sketch below. This enables the user to select the metadata efficiently.
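Since the per-image cost is roughly constant, the prediction shown in area 1009 can be as simple as a multiplication. A minimal sketch, using the 0.5 s per image figure quoted earlier:

```python
def predicted_processing_time(n_images, sec_per_image=0.5):
    """Predicted time for area 1009: proportional to the number of
    narrowed-down images (0.5 s/image is the figure quoted earlier)."""
    return n_images * sec_per_image

predicted_processing_time(12_000)   # 6,000 s, i.e. about 100 minutes
```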

When the detection start/stop button 1010 is clicked, the analysis target determinator 109 acquires, using the metadata selected through the above-described operations, the group of images to which object detection is applied, and the object region detector 110 performs object detection on this image group. The detection processes performed by the object region detector 110 are independent for each image. Thus the processed images may be displayed in the detection result display area 1011 sequentially in the processed order, and the process may be started or stopped each time the detection start/stop button 1010 is clicked.

Embodiment 1: Summary

As discussed thus far, the image analysis system 100 according to the embodiment 1 performs object detection only on image data that includes the metadata of the query image in its bibliographic information. This efficiently narrows down the target images for object detection among a vast amount of images, thereby rapidly finding the images that include the object specified by the user.

The image analysis system 100 according to the embodiment 1 may be used, for example in searches or examinations of picture trademarks, to determine whether a figure that is to be newly registered is already used in registered picture trademarks. In this case, category codes or descriptive texts may be utilized as the bibliographic information required for generating the metadata.

The image analysis system 100 according to the embodiment 1 may be applied to auction sites or shopping sites. This enables rapidly searching for products bearing patterns or marks specified by the user. In this case, product titles or comments from the sellers may be utilized as the bibliographic information of the image.

The image analysis system 100 according to the embodiment 1 may be applied to video contents. This enables finding scenes in which celebrities or landmarks appear and identifying the location of such scenes within the frame image. In this case, closed captions or transcribed speech may be utilized as the bibliographic information of the image.

Embodiment 2

In the image analysis system 100 described in the embodiment 1, the analysis target determinator 109 narrows down the images that are targets of object detection by a bibliographic information search. Therefore, even if an image actually includes the object specified by the user, it is not retrieved, and thus is not included in the analysis result, if it lacks sufficient bibliographic information. Hereinafter, a method for decreasing such detection failures by extending the bibliographic information will be described. Other configurations are approximately the same as those of the embodiment 1, so the differences will mainly be described.

FIG. 11 is a diagram showing an example of extending the bibliographic information. For the sake of comparison, FIG. 11(a) shows a conceptual diagram of a search without extending the bibliographic information, and FIG. 11(b) shows a conceptual diagram of a search extending the bibliographic information in the embodiment 2.

As shown in FIG. 11(a), the image analysis system 100 described in the embodiment 1 searches the bibliographic information using the metadata “star” as the search condition in order to find images including the object contained in the query image 301. As a result, object detection is performed on an image whose bibliographic information includes “star”, as with image 1101. On the other hand, object detection is not performed on images whose bibliographic information does not include “star”, as with image 1102. However, the image 1102 actually includes regions similar to the query image 301; thus object detection fails for the image 1102.

In the embodiment 2, as shown in FIG. 11(b), metadata is also generated for the images stored in the image database 105. The method for generating this metadata may be the same as that for generating the metadata of the query image 301. The newly generated metadata is stored in the image database 105 as additional bibliographic information. The image analysis device 106 also searches this additional bibliographic information when narrowing down the images to which object detection is applied. This enables retrieving images whose original bibliographic information does not include “star”, as with image 1103.

Compared with images in which a single object is present, images in which multiple objects are present generally show more variation in “appearance” due to changes in the layout of the objects. Thus such images are unlikely to be found as images similar to the query image. On the other hand, if an image is found that has high similarity to the query image and contains multiple objects, the informational value is not significantly degraded even if the bibliographic information of that similar image is reused.

FIG. 12 is a flowchart showing a sequence of the process of extending the bibliographic information. This flowchart is performed by the metadata generator 108 for all images in the image database 105, repeating steps S1201-S1204. This flowchart may be performed when the system load is small, for example, or immediately after images are initially stored into the image database 105. Hereinafter, each step in FIG. 12 will be described.

(FIG. 12: Step S1202)

The metadata generator 108 generates the metadata of images in the image database 105 using the existing bibliographic information stored in the image database 105. The method for generating the metadata is the same as that shown in FIG. 6. However, the similarity threshold may be stricter than that of FIG. 6, or image features may be used that are invariant even if the object layouts vary.

(FIG. 12: Step S1203)

The metadata generator 108 stores the metadata generated in step S1202 into the image database 105 as additional bibliographic information.
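A minimal sketch of the FIG. 12 loop, assuming hypothetical database access methods (`all_images`, `search_similar`, `append_biblio`) and reusing the `generate_metadata` sketch from FIG. 6:

```python
def extend_bibliographic_information(db, strict_threshold):
    """Steps S1201-S1204 sketch: generate metadata for every stored
    image from its similar images' existing bibliographic information
    (S1202) and store it back as additional bibliographic information
    (S1203). The `db` methods here are assumed interfaces."""
    for image in db.all_images():
        similar = db.search_similar(image.feature)       # (similarity, words) pairs
        metadata = generate_metadata(similar,
                                     score_fn=lambda w, f: f,
                                     threshold=strict_threshold)
        db.append_biblio(image.image_id, [w for w, _ in metadata])
```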

FIG. 13 is a Venn diagram of analyzed targets explaining how detection failure is decreased by extending the bibliographic information. FIG. 13(a) is a Venn diagram using the existing bibliographic information only. FIG. 13(b) is a Venn diagram using the extended bibliographic information.

In FIG. 13(a), the group 1301 is the group of all images stored in the image database 105. If the object specified by the user were searched for without using the image analysis system 100, the group 1301 would be the target of the image analysis process.

The group 1302 is the group of images including regions of the “star-shaped figure” specified by the user. Ideally, the image analysis system 100 outputs exactly this group.

The group 1303 is the image group acquired by the image analysis system 100 through a bibliographic information search using the automatically generated metadata “star” as a query key. The image analysis system 100 performs object detection on this group.

The group 1304 is the image group on which object detection is not performed: these images include “star-shaped figures” but do not include “star” in the bibliographic information, and thus are not handled as detection targets.

The group 1305 is the image group that is the target of object detection and in which the object is detectable because the images include the “star-shaped figure”. However, whether the object will actually be detected depends on the identification performance of the object detector. A method for improving the performance of the object detector will be described later using FIG. 15 in an embodiment 3.

The group 1306 is the image group that does not require object detection: these images include “star” in the bibliographic information but do not include regions similar to the “star-shaped figure” specified by the user.

As shown in FIG. 13(b), if the bibliographic information is extended, the group including “star” in the bibliographic information is enlarged. Since the group is expanded according to the result of similar image search, it is highly likely that the expanded region includes the “star-shaped figure”. As a result, although the processing time for detecting objects increases, detection failures decrease.

FIG. 14 is a chart showing the relationship between the processing time of image analysis and the coverage. The horizontal axis indicates the processing time and the vertical axis indicates the coverage. Coverage means the percentage of the group 1302 in FIG. 13 that has been processed. On the horizontal axis, the processing time 100 is the time when all images to be searched, i.e. the group 1301 in FIG. 13, are analyzed.

In FIG. 14, it is assumed that the proportion of the group 1302 occupied by the group 1305 is 60%, and that the proportion occupied by the group 1304 is 40%. After extending the bibliographic information, it is assumed that the proportion occupied by the group 1305 becomes 80%, and that occupied by the group 1304 becomes 20%. In addition, it is assumed that the processing time for performing object detection on the group 1305 is one-tenth of the processing time for performing object detection on the group of all images.
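The numbers behind the two polygonal lines follow directly from these assumptions; a small worked computation:

```python
# Worked numbers behind FIG. 14 under the stated assumptions.
full_time = 100                       # time to analyze all images (line 1401)

narrowed_time = full_time / 10        # detection on the metadata-narrowed set
coverage_plain = 0.60                 # point 1404: group 1305 / group 1302

coverage_extended = 0.80              # point 1405, after extending biblio info
# The extended set is larger, so its processing time exceeds narrowed_time,
# but 80% coverage is reached before falling back to the remaining images.

print(narrowed_time, coverage_plain)  # 10.0 0.6
```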

The line 1401 represents the transition of the coverage when all images (the group 1301) are analyzed. The coverage increases linearly if images are picked randomly from the image database 105.

The polygonal line 1402 represents the transition of the coverage when the analyzed targets are narrowed down using the metadata. The segment from the start to the point 1404 corresponds to the detection process on the narrowed-down images; the segment after the point 1404 corresponds to the detection process on the remaining images. At the point 1404, 60% coverage is achieved in one-tenth of the processing time of the line 1401.

The polygonal line 1403 represents the transition of the coverage when the analyzed targets are narrowed down using the bibliographic information extended according to the method of the embodiment 2. The segment from the start to the point 1405 corresponds to the detection process on the narrowed-down images; the segment after the point 1405 corresponds to the detection process on the remaining images. Although the processing time up to the point 1405 is longer because of the increased number of detection targets, the coverage is improved.

As shown in FIG. 14, the processing time and the coverage are in a trade-off relationship. Thus whether the bibliographic information should be extended must be decided depending on the application. When examining picture trademarks, it is sufficient if one similar image is found; thus it is preferable to perform the detection process after sufficiently narrowing down the processing targets, to obtain a highly responsive system. If it is desired to improve the coverage, the original bibliographic information may be used first and then the additional bibliographic information may be used as necessary.

Embodiment 2: Summary

As discussed thus far, the image analysis system 100 according to the embodiment 2 generates metadata for the images stored in the image database 105, adds the generated metadata into the image database 105 as new bibliographic information, and then performs the same process as in the embodiment 1. This makes it possible to cover images that would not be found using the existing bibliographic information alone.

Embodiment 3

In an embodiment 3 of the present invention, a method will be described for improving the accuracy of object detection by utilizing intermediate data produced during the processing of the image analysis system 100. This method uses multiple templates, as described in FIG. 7, for detecting objects. Other configurations are approximately the same as those of the embodiments 1-2. Thus, hereinafter, similar image search using multiple templates when generating the metadata of the query image will mainly be described.

FIG. 15 is a diagram showing a method for increasing the accuracy of object detection by expanding the templates that are used in searching for images similar to the query image. Hereinafter, template expansion in the embodiment 3 will be described using FIG. 15.

In the object detection method described in FIG. 7, the object region is identified by checking the similarity between portions of the image and the templates. Thus, if only the query image 301 is used as the template, it is impossible to detect star shapes with significantly different appearances, such as in the image 1505. In addition, a “solar shape” as in the image 1506 or a “planet shape” as in the image 1507 cannot be detected even though those images share the same concept of “star”.

The image analysis device 106 according to the embodiment 3 uses, as additional templates, the intermediate data generated during the process. Specifically, the image group 1501 acquired as images similar to the query image 301 while generating the metadata of the query image 301 is used as templates for object detection. In other words, not only the object specified by the user but also objects similar to the specified object become targets of object detection. Thus, in the embodiment 3, the template of the target object is expanded using the similar images acquired by the similar image search performed for generating the metadata.

The image analysis device 106 searches for images similar to the query image 301 according to the method described in FIG. 6 (S601-S602). The appearances of the image group 1501 do not completely match the query image 301. However, since those images are close to the object specified by the user, they may be appropriate as templates when performing object detection later. Thus those similar images are stored in a template database 1504. The template database 1504 is a database that temporarily stores the templates used when performing object detection, and is reset every time the query image 301 is changed.

The image analysis device 106 generates the metadata of the query image 301 and of the image group 1501 according to the method described in FIG. 6. It is now assumed, for example, that the metadata “star” is generated.

The image analysis device 106 searches for bibliographic information matching the metadata “star”. As a result, the image groups 1505-1507, including the image 1503 corresponding to the concept of “star”, are acquired. The result of the bibliographic information search may contain images including multiple objects, or may include much noise, as described in FIG. 13. Templates may therefore be selected through interactive operations with the user on operational screens such as that of FIG. 10. For example, the images acquired by the bibliographic information search may be displayed on the operational screen, and the user may select, among the displayed images, the images to be used as templates in object detection.

The image analysis device 106 performs object detection on the image groups 1505-1507 using the multiple templates stored in the template database 1504. For example, the image group 1501 is used as templates in addition to the query image 301. If the user specifies the image 1503 as a template on the operational screen, it is used as a template as well. This enables detecting star-shaped regions (e.g. a solar shape or a Saturn shape such as the image 1503) whose appearances are not similar to that of the query image 301.
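A minimal sketch of this template expansion, feeding the expanded matrix into the matching sketch shown after FIG. 7; all inputs are assumed to be feature vectors of equal dimension:

```python
import numpy as np

def build_expanded_templates(query_feature, similar_features, user_selected):
    """Embodiment 3 sketch: the template database 1504 holds the query
    image feature, the features of the similar images found during
    metadata generation (image group 1501), and any images the user
    ticked on the operational screen (e.g. image 1503). It is reset
    whenever the query image changes."""
    templates = [query_feature]
    templates.extend(similar_features)
    templates.extend(user_selected)
    return np.vstack(templates)   # matrix usable by detect_object_regions()
```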

Embodiment 3: Summary

As discussed thus far, the image analysis device 106 according to the embodiment 3 uses, as expanded templates for object detection, the similar images acquired when generating the metadata of the query image 301 and the images acquired when searching the bibliographic information. This enables detecting objects whose concept is the same but whose “appearance” differs.

Embodiment 4

In an embodiment 4 of the present invention, a configuration example will be described in which the image analysis system 100 is embedded into a content cloud system. Hereinafter, a summary of the content cloud system is given first, followed by a method for embedding the image analysis system 100 into the content cloud system as an analysis module. The configuration of the image analysis system 100 is the same as in the embodiments 1-3.

FIG. 16 is a schematic diagram of a content cloud system 1600 according to the embodiment 4. The content cloud system 1600 includes an Extract Transform Load (ETL) module 1603, a content storage 1604, a search engine 1605, a metadata server 1606, and a multimedia server 1607. The content cloud system works on common computers comprising one or more CPUs, memories, and storage devices, and itself comprises various modules. Each of the modules may be executed on an independent computer; in that case, the storages and modules are connected by networks, and the system is implemented by distributed processing that performs data communications through the network.

The application program 1608 sends a request to the content cloud system 1600 through the network. The content cloud system 1600 sends to the application 1608 information according to the request.

The content cloud system 1600 receives, as inputs, data 1601 of any form, such as video data, image data, document data, or voice data. The data 1601 is, for example, picture trademarks and their gazette documents, website images and HTML documents, or video data with closed captions or voices. The data 1601 may be structured or non-structured. The data input to the content cloud system 1600 is temporarily stored in the storage 1602.

The ETL 1603 monitors the storage 1602. When the data 1601 is stored into the storage 1602, the ETL 1603 activates the information extraction module 16031. The extracted information (metadata) is archived in the content storage 1604.

The information extraction module 16031 includes, for example, a text index module and an image identification module. Examples of metadata are time, N-gram index, image recognition results (object name, region coordinates in the image), image features and their related words, or voice recognition results. Any program that extracts some kind of information (metadata) may be used as the information extraction module 16031; since commonly known techniques may be employed, details are omitted here. If necessary, the metadata may be compressed using a data compression algorithm. After the ETL 1603 extracts the information, information such as the data filename, data registration date, original data type, or metadata text information may be stored in a relational database (RDB).

The content storage 1604 stores the information extracted by the ETL 1603 as well as the pre-processing data 1601 that was temporarily stored in the storage 1602.

When requested by the application program 1608, the search engine 1605 performs a search (in the case of text, according to the index created by the ETL 1603) and sends the search result to the application program 1608. Commonly known techniques may be applied as the algorithm of the search engine 1605. The search engine 1605 may include not only modules for searching texts but also modules for searching data such as images or voices.

The metadata server 1606 manages the metadata stored in the RDB. For example, it is assumed that the data filenames, data registration dates, original data types, or metadata text information extracted by the ETL 1603 are stored in the RDB. When requested by the application 1608, the metadata server 1606 sends the information in the RDB according to the request.

The multimedia server 1607 associates the pieces of metadata extracted by the ETL 1603 with each other, and stores the meta information in a structured manner in graph format. As an example of association, for a voice recognition result of “apple” stored in the content storage 1604, the relationships among the original voice file, image data, and related words may be represented in network format. When requested by the application 1608, the multimedia server 1607 sends the corresponding meta information to the application 1608. For example, if a request for “apple” is received, the multimedia server 1607 sends, according to the built graph structure, meta information related in the network graph, such as images including an apple, an average price, or a song name of an artist.
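The graph-format association might be sketched as follows, assuming an adjacency-list representation; the “apple” example mirrors the one above, and the class and method names are illustrative.

```python
from collections import defaultdict

class MultimediaGraph:
    """Sketch of the multimedia server 1607's graph store. Nodes are
    metadata items (a term, a voice file, an image); edges link items
    extracted from related media."""

    def __init__(self):
        self.edges = defaultdict(set)

    def associate(self, a, b):
        # Undirected association between two metadata items.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def related(self, key):
        # Everything directly linked to the requested item, e.g. all
        # meta information reachable from "apple".
        return self.edges[key]

g = MultimediaGraph()
g.associate("apple", "voice/apple_001.wav")
g.associate("apple", "image/apple_red.jpg")
print(g.related("apple"))
```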

In the content cloud system 1600, the image analysis system 100 works as the information extraction module 16031 in the ETL 1603. The image/document storage device 101 and the data storage device 104 in FIG. 1 correspond to the storage 1602 and the content storage 1604 in FIG. 16, respectively. The image analysis device 106 corresponds to the information extraction module 16031. If multiple information extraction modules 16031 are embedded into the ETL 1603, the resources of one computer may be shared, or independent computers may be used for each of the modules. The image database 105 in FIG. 1 corresponds to the dictionary data 16032 that is required by the ETL 1603 to extract information.
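One hypothetical way to embed the image analysis device 106 as an extraction module is sketched below; the wrapped method names stand in for the metadata-generation and object-detection steps of the earlier embodiments and are assumptions.

```python
class ImageAnalysisExtractionModule:
    """Sketch of the image analysis device 106 acting as an
    information extraction module 16031 inside the ETL 1603."""

    def __init__(self, analyzer, image_database):
        self.analyzer = analyzer          # image analysis device 106
        self.dictionary = image_database  # plays the role of dictionary data 16032

    def extract(self, data):
        # Generate metadata for the incoming image using the image
        # database, then detect object regions, and return both so the
        # ETL can archive them in the content storage 1604.
        metadata = self.analyzer.generate_metadata(data, self.dictionary)
        regions = self.analyzer.detect_object_regions(data, metadata)
        return {"metadata": metadata, "regions": regions}
```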

Embodiment 4 Summary

As discussed thus far, the image analysis system 100 according to the present invention may be applied as a component of the content cloud system 1600. The content cloud system 1600 may integrate information across media by generating metadata that is commonly usable among the individual media data. This enables providing highly valued information to users.

The present invention is not limited to the embodiments, and various modified examples are included. The embodiments are described in detail to explain the present invention in an easily understood manner, and the present invention is not necessarily limited to embodiments that include all of the configurations described above. Part of the configuration of an embodiment can be replaced by the configuration of another embodiment. The configuration of an embodiment can be added to the configuration of another embodiment. Addition, deletion, and replacement of other configurations are also possible for part of the configurations of the embodiments.

The configurations, the functions, the processing units, the processing means, etc., may be realized by hardware, such as by designing part or all of the components as an integrated circuit. A processor may interpret and execute programs for realizing the functions, thereby realizing the configurations, the functions, etc., by software. Information such as programs, tables, and files for realizing the functions can be stored in a recording device, such as a memory, a hard disk, or an SSD (Solid State Drive), or in a recording medium, such as an IC card, an SD card, or a DVD.

REFERENCE SIGNS LIST

-   100: image analysis system
-   101: image/document input device
-   102: input device
-   103: display device
-   104: data storage device
-   105: image database
-   106: image analysis device
-   107: image/document input unit
-   108: metadata generator
-   109: analysis target determinator
-   110: object region detector
-   111: operational information input unit
-   112: data output unit
-   1600: content cloud system
-   1602: storage
-   1603: ETL module
-   1604: content storage
-   1605: search engine
-   1606: metadata server
-   1607: multimedia server
-   1608: application program

CLAIMS

1. An image analysis device comprising: an image input unit that receives query image data including an image of an object to be searched; a metadata generator that generates metadata of the query image data using an image database storing image data and its bibliographic information; an analysis target determinator that extracts one or more of the image data stored in the image database, the extracted image data having the bibliographic information matching with the metadata; an object region detector that detects, in one or more of the image data extracted by the analysis target determinator, a region including an image of the object; and an output unit that outputs a result detected by the object region detector.

2. The image analysis device according to claim 1, wherein the metadata generator searches the image data stored in the image database, the searched image data being similar to the query image data, and wherein the metadata generator generates the metadata using the bibliographic information of the searched image data.

3. The image analysis device according to claim 2, wherein the metadata generator calculates a score of the metadata using a frequency by which the metadata appears in the bibliographic information of the image data acquired by the search, and wherein the analysis target determinator determines, using the score, the metadata that is used as a search key when extracting the image data matching with the bibliographic information.

4. The image analysis device according to claim 3, wherein the analysis target determinator extracts the image data associated with the bibliographic information matching with the metadata, using the metadata within a predetermined range sequentially from higher one of the scores as a search key.

5. The image analysis device according to claim 3, wherein the analysis target determinator receives a metadata designation specifying the metadata used for extracting the image data matching with the bibliographic information, and wherein the analysis target determinator extracts the image data that is associated with the bibliographic information matching with the specified metadata.

6. The image analysis device according to claim 5, wherein the image analysis device includes a display unit that displays a number of the image data which is a target for the object region detector to detect a region including an image of the object, and that displays a processing time for the object region detector to perform the detection, and wherein the analysis target determinator recalculates the number of the image data and the processing time every time when receiving the metadata designation, and reflects a result of the recalculation on the display unit.

7. The image analysis device according to claim 2, wherein the metadata generator receives a similar image designation specifying the image data among the searched image data that is used for generating the metadata along with the query image, wherein the metadata generator searches, among the image data stored in the image database, the image data similar to the query image data and the image data specified by the similar image designation, and wherein the metadata generator generates the metadata using the bibliographic information of the image data acquired by the search.

8. The image analysis device according to claim 1, wherein the object region detector calculates a distance between a feature vector of a partial region of the image data and a feature vector of the query image data, and wherein the object region detector determines, depending on whether the distance is within a predetermined range, whether the object included in the query image data is included in the partial region.

9. The image analysis device according to claim 1, wherein the output unit outputs, along with a result detected by the object region detector, a number of the object detected by the object region detector in the image data.

10. The image analysis device according to claim 1, wherein the output unit outputs, along with a result detected by the object region detector, a detection reliability of the object detected by the object region detector in the image data.

11. The image analysis device according to claim 1, wherein the metadata generator generates metadata of one of the image data stored in the image database using other one of the image data stored in the image database, and adds the generated metadata as the bibliographic information, and wherein the analysis target determinator extracts, using the bibliographic information where the metadata is added, the image data among the image data stored in the image database, the bibliographic information of the extracted image data matching with the metadata.

12. The image analysis device according to claim 1, wherein the object region detector detects, among one or more of the image data extracted by the analysis target determinator, a region including an image of the object and a region including an image of an object included in the image data acquired by the metadata generator in the search.

13. The image analysis device according to claim 12, wherein the object region detector receives a detection target designation specifying, among the image data acquired by the metadata generator in the search, the image data including an object that is to be detected along with the object included in the query image data, and wherein the object region detector detects, among one or more of the image data extracted by the analysis target determinator, a region including an image of the object and a region including an image of an object included in the image data specified by the detection target designation.

14. An image analysis system comprising: the image analysis device according to claim 1; and an image database that stores image data and its bibliographic information associated with each other, wherein the metadata generator generates metadata of the query image data using the image database.

15. An image analysis method comprising: an image input step receiving query image data including an image of an object to be searched; a metadata generation step generating metadata of the query image data using an image database storing image data and its bibliographic information; an analysis target determination step extracting one or more of the image data stored in the image database, the extracted image data having the bibliographic information matching with the metadata; an object region detection step detecting, in one or more of the image data extracted in the analysis target determination step, a region including an image of the object; and an output step outputting a result detected in the object region detection step.
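For illustration only, the method of claim 15 combined with the region test of claim 8 might be sketched as follows. The keyword matching, the Euclidean metric, and the data layout are assumptions, and the metadata generation step is assumed to have already produced the query keywords.

```python
import math
from dataclasses import dataclass

@dataclass
class StoredImage:
    biblio: str           # bibliographic information text
    region_vectors: list  # feature vectors of partial regions

def image_analysis_method(query_keywords, query_vector, image_db,
                          threshold=0.5):
    """Minimal end-to-end sketch of the claimed method."""
    # Analysis target determination step: narrow the search targets to
    # images whose bibliographic information matches the metadata.
    candidates = [img for img in image_db
                  if any(kw in img.biblio for kw in query_keywords)]
    # Object region detection step (claim 8): a partial region is a hit
    # when its feature vector lies within the predetermined distance of
    # the query image's feature vector.
    results = []
    for img in candidates:
        hits = [v for v in img.region_vectors
                if math.dist(v, query_vector) <= threshold]
        if hits:
            results.append((img, hits))
    return results  # output step
```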